Abstract

This paper studies the validity of nonparametric tests used in the regression discontinuity 
design. The null hypothesis of interest is that the average treatment effect at the threshold in 
the so-called sharp design equals a pre-specified value. We first show that, under assumptions 
used in the majority of the literature, for any test the power against any alternative is bounded 
above by its size. This result implies that, under these assumptions, any test with nontrivial 
power will exhibit size distortions. We next provide a sufficient strengthening of the standard 
assumptions under which we show that a novel test in the literature can control limiting size. 
1 Introduction

The nonparametric literature on the regression discontinuity design (RDD) is characterized by the
nonparametric identification of parameters at the threshold. In this paper we study the construction of
tests for these parameters, for which numerous alternatives exist in econometrics; see, for
example, McCrary (2008), Frandsen et al. (2012), Calonico et al. (2014) and Otsu et al. (2015) for
such tests, and Lee and Lemieux (2010) and Imbens and Lemieux (2008) for surveys of
the literature. In particular, we focus on the null hypothesis that the average treatment effect at
the threshold in the sharp design equals a pre-specified value.

*I am grateful to Ivan Canay for his valuable guidance and suggestions. I thank the Co-Editor, two anonymous 
referees, Matias Cattaneo, Joel Horowitz, Pedro Sant’Anna and Max Tabord-Meehan for their helpful comments. 
When testing this null hypothesis in simulation studies (not reported), we observe that available
tests fail to control the rejection probability under some null distributions with practical sample
sizes. This failure occurs for distributions that satisfy the typically imposed assumptions, and in
turn makes us question the reliability of current inference procedures. We therefore formally study
the construction of valid tests for our null hypothesis. As stated in Section 3, the ideal aim is to
construct a finite sample valid test, which requires finite sample control of size, i.e. of the null
rejection probabilities. Since achieving this nontrivially may be too demanding, one may aim to
approximate this finite sample goal in large samples through two different definitions of asymptotic
validity. The first, termed uniform asymptotic validity, requires limiting control of the null rejection
probability uniformly across distributions under the null, whereas the second, termed pointwise
asymptotic validity, requires such control to hold for each fixed distribution under the null. As
highlighted in Remark 4.3, current tests are shown to satisfy only the second definition, which may
not provide any guarantee on the control of finite sample size. The practical importance of the
distinction between these definitions has also been noted in various other econometric applications;
see, for example, Mikusheva (2007) and Mikusheva (2012) for unit roots in autoregressive
models, Romano and Shaikh (2008) and Andrews and Guggenberger (2009b) for moment inequality
models, Leeb and Pötscher (2005) and Andrews and Guggenberger (2009a) for post model selection,
and Dufour (1997) and Mikusheva (2010) for weak instrumental variable models.

Our first result establishes that, under standard assumptions in the basic setup, any test for 
our null hypothesis of interest will have power against any alternative bounded above by its size. 
This implies that it is impossible to construct nontrivial finite sample valid tests and uniformly 
asymptotically valid tests under these assumptions. Intuitively, this result occurs because the 
assumptions permit a set of possible distributions that is ‘too large’, in a sense made precise 
in Lemma 3.1. This causes distributions under the null and alternative to be ‘arbitrarily’ close,
making it impossible to distinguish them given the data. Our goal with this impossibility
result is not to criticize current nonparametric tests but to caution researchers using
them. Such nonparametric tests are often viewed as appealing because they only require imposing mild
regularity assumptions. We hope to convey, however, that these assumptions allow the permitted set
of distributions to be arbitrarily large, resulting in misleading inference. To recover reliable inference,
the researcher would then naturally need to strengthen the assumptions to further restrict the 
permitted set of distributions. To this end, our second result illustrates a sufficient strengthening 
of the standard assumptions under which the Calonico et al. (2014) test is uniformly asymptotically 
valid. Our stronger assumptions are analogous to the ones commonly required for optimality results 
in nonparametric estimation; see, for example, van der Vaart (1998, Chapter 24). 

In addition to the literature on RDD, this paper is also related to the growing literature in
econometrics on the testability of hypotheses. Bahadur and Savage (1956) was the first paper to
demonstrate the impossibility of constructing nontrivial valid tests for the mean of a distribution.
Romano (2004) extended this result to provide sufficient conditions to examine the testability of
hypotheses in different settings. The key insight is to formalize the notion of closeness of the sets of
null and alternative distributions using the total variation metric. In econometrics, Canay et al.
(2013) verified one of these conditions to establish the impossibility of constructing nontrivial valid
tests for some hypotheses in nonparametric models with endogeneity. In this paper we verify the
same condition, restated as Lemma 3.1 here, to prove our impossibility result. Alternatively,
Guggenberger (2010a,b) used a direct approach of considering sequences of distributions under the null
to show limiting size distortions in the Hausman pretest. For further examples of such impossibility
results see Lehmann and Loh (1990), Leeb and Pötscher (2008) and Müller (2008), and for a review
of such results in econometrics see Dufour (2003).

The remainder of the paper is organized as follows. Section 2 describes the basic RDD setup, 
where we introduce the notation, the commonly imposed assumptions and the null hypothesis of 
interest. Section 3 states our testing problem. Section 4 illustrates our main results. 


2 Basic RDD Setup 

Assume there are random variables $(Y(0), Y(1), Z) \sim Q \in \mathbf{Q}$, where $\mathbf{Q}$ is a set of distributions on a
sample space $\mathcal{W} = \mathcal{Y} \times \mathcal{Y} \times \mathcal{Z} \subseteq \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ such that $\mathcal{Z}$ contains a neighbourhood of zero. Here, let
$Y(0)$ denote the potential outcome under treatment zero, $Y(1)$ denote the potential outcome under
treatment one, and $Z$ denote an observed predetermined characteristic. The observed random
variables from the experiment are $(Y, Z) \sim P \in \mathbf{P}$, where $\mathbf{P}$ is a set of distributions on a sample
space $\mathcal{X} = \mathcal{Y} \times \mathcal{Z} \subseteq \mathbb{R} \times \mathbb{R}$. The observed outcome is determined by

$$ Y = A \cdot Y(1) + (1 - A) \cdot Y(0) , \tag{1} $$

where treatment assignment follows a normalized threshold rule of the form

$$ A = 1\{Z \ge 0\} . \tag{2} $$

Since

$$ (Y, Z) = M(Y(0), Y(1), Z) , \tag{3} $$

where $M : \mathcal{W} \to \mathcal{X}$ is the mapping implied by (1), we have that $P = QM^{-1}$ and

$$ \mathbf{P} = \{QM^{-1} : Q \in \mathbf{Q}\} , \tag{4} $$

where $M^{-1}$ is the pre-image of $M$. Let $W^{(n)} = \{(Y_i(0), Y_i(1), Z_i) : 1 \le i \le n\}$ denote an i.i.d.
sample from $Q$, and let $X^{(n)} = \{(Y_i, Z_i) : 1 \le i \le n\}$ denote the corresponding observed i.i.d.
sample from $P$. Further, let $P^n$ denote the $n$-fold product $\bigotimes_{i=1}^{n} P$, i.e. the joint distribution of
the observed data.
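
The mapping from potential outcomes to observed data in (1)-(3) can be illustrated with a small simulation. The following is a minimal sketch and not part of the paper: the function name `simulate_sharp_rdd`, the uniform distribution of the running variable, the linear conditional mean and the constant treatment effect `theta` are all assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_sharp_rdd(n, theta=1.0, seed=0):
    """Draw (Y(0), Y(1), Z) and apply the mapping M implied by (1).

    Illustrative assumptions (not from the paper): Z ~ Uniform[-1, 1],
    E[Y(0) | Z = z] = 0.5 * z, unit-variance Gaussian noise and a
    constant treatment effect theta.
    """
    rng = np.random.default_rng(seed)
    z = rng.uniform(-1.0, 1.0, size=n)   # running variable, threshold normalized to 0
    y0 = 0.5 * z + rng.normal(size=n)    # potential outcome under treatment zero
    y1 = y0 + theta                      # potential outcome under treatment one
    a = (z >= 0).astype(float)           # threshold rule; ties at 0 have probability zero
    y = a * y1 + (1.0 - a) * y0          # observed outcome, equation (1)
    return y, z, y0, y1


y, z, y0, y1 = simulate_sharp_rdd(n=1000, theta=1.0)
# Only one potential outcome is observed per unit: Y(1) to the right, Y(0) to the left.
assert np.array_equal(y[z >= 0], y1[z >= 0])
assert np.array_equal(y[z < 0], y0[z < 0])
```

The researcher only ever sees `(y, z)`; the pair `(y0, y1)` is returned here solely to make the role of the mapping explicit.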
We next illustrate the standard assumptions and the resulting set of possible distributions $\mathbf{Q}$,
which plays a fundamental role in our analysis. In order to do so, we introduce further notation. Let
$\mu_-(z,Q) = E_Q[Y(0) \mid Z = z]$ and $\mu_+(z,Q) = E_Q[Y(1) \mid Z = z]$, and, whenever $\mu_-(\cdot,Q)$ and $\mu_+(\cdot,Q)$
have the appropriate level of differentiability, let $\mu_-^{(v)}(z,Q) = d^v\mu_-(z,Q)/dz^v$ and $\mu_+^{(v)}(z,Q) =
d^v\mu_+(z,Q)/dz^v$. Further, let $\sigma_-^2(z,Q) = \mathrm{Var}_Q[Y(0) \mid Z = z]$ and $\sigma_+^2(z,Q) = \mathrm{Var}_Q[Y(1) \mid Z = z]$,
and let $f_Q(z)$ denote the density of $Z$. Using this notation, let

$$ \mathbf{Q} = \{Q \in \mathbf{Q}_{\mathcal{W}} : Q \text{ satisfies Assumption 2.1}\} , \tag{5} $$

where $\mathbf{Q}_{\mathcal{W}}$ denotes the set of all Borel probability measures on $\mathcal{W}$ that have a density on $\mathcal{Z}$
with respect to the Lebesgue measure, and Assumption 2.1 is stated below. Assumption 2.1, in
particular, captures the commonly imposed restrictions in the majority of the nonparametric RDD
literature; see, for example, Calonico et al. (2014) and Imbens and Kalyanaraman (2012).

Assumption 2.1. Let $Q$ be such that there exist real numbers $\kappa(Q) > 0$, $L(Q) > 0$ and $U(Q) > 0$
where for all $z \in (-\kappa(Q), \kappa(Q))$, i.e. in a neighbourhood around the threshold, the following
conditions hold true.

(i) $f_Q(z)$ is continuous and $L(Q) \le f_Q(z) \le U(Q)$.

(ii) $E_Q[Y(0)^4 \mid Z = z] \le U(Q)$ and $E_Q[Y(1)^4 \mid Z = z] \le U(Q)$.

(iii) $\mu_-(z,Q)$ and $\mu_+(z,Q)$ are 3 times continuously differentiable, and $|\mu_-^{(v)}(z,Q)| \le U(Q)$ and
$|\mu_+^{(v)}(z,Q)| \le U(Q)$ for $v = 1,2,3$.

(iv) $\sigma_-^2(z,Q)$ and $\sigma_+^2(z,Q)$ are continuous, and $L(Q) \le \sigma_-^2(z,Q) \le U(Q)$ and $L(Q) \le \sigma_+^2(z,Q) \le
U(Q)$.

In this setting, our parameter of interest is the average treatment effect (ATE) at the threshold, 

$$ \theta(Q) = \mu_+(0,Q) - \mu_-(0,Q) . \tag{6} $$

The above parameter is identified, as shown by Hahn et al. (2001), using the distribution of the
observed random variables by

$$ \theta(P) = \lim_{z \to 0^+} \mu(z,P) - \lim_{z \to 0^-} \mu(z,P) , \tag{7} $$

where $\mu(z,P) = E_P[Y \mid Z = z]$. The hypotheses of interest can then be stated as

$$ H_0 : P \in \mathbf{P}_0 \quad \text{versus} \quad H_1 : P \in \mathbf{P}_1 = \mathbf{P} \setminus \mathbf{P}_0 , \tag{8} $$

where $\mathbf{P}_0 = \{P \in \mathbf{P} : \theta(P) = \theta_0\}$ is the subset of $\mathbf{P}$ for which the null hypothesis that the ATE at
the threshold equals a pre-specified value of $\theta_0$ holds.

Remark 2.1. To be concise, we focus on the ATE in the so-called sharp RDD (characterized by 
the treatment assignment rule in (2)). Our results in Section 4 will however follow with some 
manipulation for other parameters such as quantiles, and for parameters in other designs such as 
the kink RDD in Card et al. (2015) or the fuzzy RDD. ■ 
3 Testing Problem


The testing problem we study is to ideally construct a finite sample test $\phi_n = \phi_n(X^{(n)})$ for (8). A
requirement of the test is that it controls size; the test is said to be level $\alpha$ whenever

$$ \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] \le \alpha , \tag{9} $$

where $\alpha \in (0,1)$ is the chosen level of significance. Note that the above is a finite sample
requirement, and constructing nontrivial tests that control size in finite samples may be too demanding.
Alternatively, we also study the construction of a sequence of tests $\{\phi_n\}_{n=1}^{\infty}$ that are required to
control limiting size, i.e.

$$ \limsup_{n \to \infty} \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] \le \alpha , \tag{10} $$

and are said to be uniformly asymptotically level $\alpha$. As highlighted in Remark 4.3, this requirement
is in contrast to the one for pointwise asymptotically valid tests where (10) is not required to hold
uniformly across distributions in $\mathbf{P}_0$.


In our results, we show that under the commonly imposed setup described in the previous section,
it is impossible to construct nontrivial tests that satisfy (9) or sequences of asymptotically
nontrivial tests that satisfy (10). We achieve this by showing that (8) has the property
that for any test $\phi_n$ the power against any alternative is bounded above by its size, i.e.

$$ \sup_{P \in \mathbf{P}_1} E_{P^n}[\phi_n] \le \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] . \tag{11} $$

To prove this claim, we rely on an insightful result from Romano (2004) restated in the following
lemma for clarity, where

$$ \tau(P,P') = \sup_{\{g : |g| \le 1\}} \left| \int g \, dP - \int g \, dP' \right| \tag{12} $$

denotes the total variation metric between any two distributions $P$ and $P'$. This lemma additionally
formalizes the concept of what we mean by $\mathbf{P}$ (and hence $\mathbf{Q}$) being large in some sense.


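Lemma 3.1. Let $n \ge 1$ and $\phi_n$ be any test of $\mathbf{P}_0$ versus $\mathbf{P}_1$ in (8). If for every $P \in \mathbf{P}_1$ there exists
a sequence $\{P_k\}_{k=1}^{\infty}$ in $\mathbf{P}_0$ such that $\tau(P, P_k) \to 0$ as $k \to \infty$, then

$$ \sup_{P \in \mathbf{P}_1} E_{P^n}[\phi_n] \le \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] . \tag{13} $$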


4 Main Results 

4.1 Testability in the Basic Setup 

In the following theorem we establish that when Q is defined as in (5), any test for (8) will have 
power against any alternative bounded above by its size. 
Theorem 4.1. Let $n \ge 1$, $\mathbf{Q}$ be defined as in (5), $\mathbf{P}$ be as in (4) and $\mathbf{P}_0$ and $\mathbf{P}_1$ be as in (8).
Then any test $\phi_n$ satisfies

$$ \sup_{P \in \mathbf{P}_1} E_{P^n}[\phi_n] \le \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] . \tag{14} $$


Proof. Fix $P \in \mathbf{P}_1$ and take any strictly positive sequence $\{\epsilon_k\}_{k=1}^{\infty}$ such that $\epsilon_k \to 0$ as $k \to \infty$.
Since (4) implies that $P = QM^{-1}$ for some $Q \in \mathbf{Q}$, it then follows from Assumption 2.1 (i) that
for every $k$ there exists a Borel set $B_k$ in $\mathcal{X}$,

$$ B_k = \{(y, z) \in \mathcal{X} : z \in (-\tilde{\epsilon}_k, \tilde{\epsilon}_k)\} , \tag{15} $$

where $\tilde{\epsilon}_k > 0$, such that $0 < P(B_k) \le \epsilon_k$. Take next any $P' \in \mathbf{P}_0$ that has the same density on $\mathcal{Z}$
as $P$. We may then construct the sequence $\{P_k\}_{k=1}^{\infty}$ such that for every Borel subset $B$ of $\mathcal{X}$,

$$ P_k(B) := P(B \cap B_k^c) + P'(B \cap B_k) , \tag{16} $$

where $B_k^c$ denotes the complement of $B_k$. One can verify that for every $k$, $P_k$ is a well defined
distribution.


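Next, we show that $\{P_k\}_{k=1}^{\infty}$ is in $\mathbf{P}_0$, i.e. for every $k$ there exists $Q_k \in \mathbf{Q}$ such that $\theta(Q_k) = \theta_0$
and $P_k = Q_k M^{-1}$. To construct this $Q_k$, first note that $P = QM^{-1}$ and $P' = Q'M^{-1}$ for some
$Q \in \mathbf{Q}$ and $Q' \in \mathbf{Q}$ with $\theta(Q') = \theta_0$. Then for every Borel subset $\tilde{B}$ of $\mathcal{W}$ let

$$ Q_k(\tilde{B}) := Q(\tilde{B} \cap \tilde{B}_k^c) + Q'(\tilde{B} \cap \tilde{B}_k) , \tag{17} $$

where

$$ \tilde{B}_k = M^{-1}(B_k) = \{(y_0, y_1, z) \in \mathcal{W} : z \in (-\tilde{\epsilon}_k, \tilde{\epsilon}_k)\} , \tag{18} $$

and $\tilde{B}_k^c$ denotes the complement of $\tilde{B}_k$, which in this case is just $M^{-1}(B_k^c)$. Analogous to $P_k$,
it follows that $Q_k$ is a well defined distribution. To show that $Q_k \in \mathbf{Q}$, first note (17) ensures
that $Q_k(A) = Q'(A)$ for every Borel subset $A$ of $\mathcal{W}$ that satisfies $A \subseteq \tilde{B}_k$. This implies that the
density and all the conditional on $Z = z$ quantities in Assumption 2.1 are equal for $Q_k$ and $Q'$ for all
$z \in (-\tilde{\epsilon}_k, \tilde{\epsilon}_k)$. In turn, it follows that $Q_k$ satisfies Assumption 2.1 by taking $\kappa(Q_k) = \min\{\kappa(Q'), \tilde{\epsilon}_k\}$,
$L(Q_k) = L(Q')$, and $U(Q_k) = U(Q')$. Further, by the same argument, it follows that $\theta(Q_k) = \theta_0$ as
$\theta(Q') = \theta_0$. Finally, given that for every $B \subseteq \mathcal{X}$ and $\tilde{B} = M^{-1}(B)$ we have $\tilde{B} \cap \tilde{B}_k^c = M^{-1}(B \cap B_k^c)$
and $\tilde{B} \cap \tilde{B}_k = M^{-1}(B \cap B_k)$, we can establish $P_k = Q_k M^{-1}$ from (16) and (17).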


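To conclude, we show that the total variation distance between $P$ and $P_k$ goes to 0 as $k \to \infty$:

$$ \begin{aligned}
\tau(P, P_k) &= \sup_{\{g : |g| \le 1\}} \left| \int g \, dP - \int g \, dP_k \right| \\
&= \sup_{\{g : |g| \le 1\}} \left| \int_{B_k} g \, dP - \int_{B_k} g \, dP' \right| \\
&\le \left| \int_{B_k} dP \right| + \left| \int_{B_k} dP' \right| \\
&\le 2\epsilon_k \to 0 ,
\end{aligned} \tag{19} $$

where the second and fourth relations follow from (16) and (15) respectively, along with noting that
$P$ and $P'$ have the same density for $Z$. Since $P \in \mathbf{P}_1$ was chosen arbitrarily, we can then invoke
Lemma 3.1 to conclude the proof. ■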
Remark 4.1. In invoking Lemma 3.1 to prove Theorem 4.1, for any $P \in \mathbf{P}_1$ we construct a
sequence $\{P_k\}_{k=1}^{\infty}$ in $\mathbf{P}_0$ such that for every $k$ there exists a Borel set in $\mathcal{X}$ with positive probability
under $P_k$ on which $P$ and $P_k$ differ, while the two are equal on the complement of this set. Letting
the probability of this set vanish with $k$ implies that $\tau(P, P_k) \to 0$ as $k \to \infty$. Further, since
Assumption 2.1 only requires conditions local to zero that are pointwise in nature, we formally
show in the proof that our construction $\{P_k\}_{k=1}^{\infty}$ falls in $\mathbf{P}_0$. Note that our
construction is not unique and that multiple others are possible. ■
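
The construction can be made concrete in a simple parametric example. The sketch below is an illustration under assumptions that are not in the paper: Z is uniform on [-1, 1], Y given Z = z is Gaussian with unit variance, P has conditional mean 1{z >= 0} (so its ATE at the threshold is 1), and P' has conditional mean zero (so its ATE equals a null value of 0). For P_k equal to P' on {|z| < eps_k} and to P elsewhere, the distance between P and P_k, taken as the supremum over functions bounded by one in absolute value, has a closed form that vanishes with eps_k even though every P_k satisfies the null. The helper names are hypothetical.

```python
from statistics import NormalDist


def tau_unit_gaussians(shift):
    """Distance between N(a, 1) and N(a + shift, 1), sup taken over |g| <= 1.

    This is twice the largest difference in probability over sets.
    """
    return 2.0 * (2.0 * NormalDist().cdf(abs(shift) / 2.0) - 1.0)


def tau_p_pk(eps, jump=1.0):
    """Distance between P and P_k when both share the Uniform[-1, 1] density of Z.

    P and P_k differ only for z in [0, eps), a set of Z-probability eps / 2,
    on which the conditional means of Y differ by `jump`.
    """
    mass = min(eps, 1.0) / 2.0
    return mass * tau_unit_gaussians(jump)


for eps in (0.5, 0.1, 0.01):
    # The distance is bounded by 2 * P(|Z| < eps) = 2 * eps, as in the proof of Theorem 4.1.
    print(eps, tau_p_pk(eps), tau_p_pk(eps) <= 2.0 * eps)
```

In this example the conditional mean of Y(1) under the latent distribution behind P_k jumps at z = eps_k, which Assumption 2.1 tolerates because its conditions only need to hold on a neighbourhood of the threshold that may shrink with k.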

Remark 4.2. It is important to emphasize that Theorem 4.1 is not a criticism of a specific test but
holds for any choice of test. Furthermore, it is a statement on the finite sample property of any
test, but with important asymptotic implications. To be specific, for any sequence of tests $\{\phi_n\}_{n=1}^{\infty}$
with nontrivial limiting power, it directly follows from (14) that

$$ \limsup_{n \to \infty} \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n] > \alpha . \tag{20} $$

This additionally implies that if the sequence of tests is pointwise consistent in power, i.e. pointwise
power converges to one, then limiting size is in fact equal to one. ■
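
The step from the finite sample bound in Theorem 4.1 to the statement on limiting size can be spelled out as follows. This is a short worked derivation added for clarity; the symbols $P^*$ and $\beta$ are introduced here and do not appear in the paper.

```latex
% Suppose that for some fixed alternative P^* \in \mathbf{P}_1 the limiting power is nontrivial:
%   \limsup_{n \to \infty} E_{(P^*)^n}[\phi_n] = \beta > \alpha .
\begin{align*}
\limsup_{n \to \infty} \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n]
  &\ge \limsup_{n \to \infty} \sup_{P \in \mathbf{P}_1} E_{P^n}[\phi_n]
     && \text{by Theorem 4.1, applied for each } n \\
  &\ge \limsup_{n \to \infty} E_{(P^*)^n}[\phi_n] = \beta > \alpha .
\end{align*}
% If E_{(P^*)^n}[\phi_n] \to 1, the same chain gives limiting size equal to one,
% since rejection probabilities never exceed one.
```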

Remark 4.3. Currently used tests are shown to be only pointwise asymptotically valid, i.e.

$$ \limsup_{n \to \infty} E_{P^n}[\phi_n] \le \alpha \quad \text{for all } P \in \mathbf{P}_0 , \tag{21} $$

which does not say anything about whether this sequence of tests $\{\phi_n\}_{n=1}^{\infty}$ approximates (9) for
large enough $n$. To be specific, it is possible that for every $n \ge 1$ there exists $P \in \mathbf{P}_0$ such that

$$ E_{P^n}[\phi_n] \gg \alpha . \tag{22} $$


4.2 Uniformly Valid Test under Stronger Assumptions 

In this section, we ask under what alternative assumptions we can construct a uniformly asymptotically
valid test. We consider, in particular, a natural strengthening of Assumption 2.1 leading
to the following alternative definition of the set of possible distributions,

$$ \mathbf{Q} = \{Q \in \mathbf{Q}_{\mathcal{W}} : Q \text{ satisfies Assumption 4.1}\} , \tag{23} $$

where as before $\mathbf{Q}_{\mathcal{W}}$ denotes the set of all Borel probability measures on $\mathcal{W}$ that have a density
on $\mathcal{Z}$ with respect to the Lebesgue measure, and Assumption 4.1 is stated below. Note that if
$Q$ satisfies Assumption 4.1 then it satisfies Assumption 2.1, and hence the definition of $\mathbf{Q}$ in (23)
generates a smaller set of distributions than the definition of $\mathbf{Q}$ in (5).

Assumption 4.1. Let $Q$ be such that it satisfies Assumption 2.1 with $\kappa(Q) = \kappa$, $L(Q) = L$ and
$U(Q) = U$, where $\kappa > 0$, $L > 0$ and $U > 0$ are real numbers that do not depend on $Q$.
We next briefly describe a simple version of the Calonico et al. (2014) test (referred to as CCT
hereafter), which is demonstrated to satisfy (10) under this smaller set of distributions. For the
null hypothesis in (8), the CCT test statistic is

$$ T_n^{CCT}(X^{(n)}) = \frac{\hat{\theta}_n - \theta_0}{\hat{S}_n} , \tag{24} $$

where $\hat{\theta}_n$ is a bias corrected local linear estimator of $\theta(P)$, and $\hat{S}_n$ is a plug-in estimator of a novel
standard error formula that accounts for the variance of the bias estimate. The bias is estimated
using a local quadratic estimator. Furthermore, for all estimates, we use the triangular kernel and
a deterministic sequence of bandwidth choices denoted by $h_n$. Then, the CCT level $\alpha$ test is

$$ \phi_n^{CCT}(X^{(n)}) = \begin{cases} 1 & \text{if } |T_n^{CCT}(X^{(n)})| > z_{1-\alpha/2} \\ 0 & \text{otherwise} \end{cases} , \tag{25} $$

where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$-quantile of the standard normal distribution.


The following theorem demonstrates that under the alternative definition of $\mathbf{Q}$ in (23), the
test statistic in (24) for (8) converges in distribution, uniformly in $\mathbf{P}_0$, to the standard normal
distribution. It then directly follows that the test in (25) is uniformly asymptotically level $\alpha$, and,
in fact, has limiting size equal to $\alpha$.
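
A stripped-down numerical version of a test of this type can be sketched as follows. This is not the CCT implementation: it assumes a user-supplied bandwidth `h`, fits a local quadratic regression with the triangular kernel on each side of the threshold in place of the bias corrected local linear estimator, and uses a simple plug-in variance built from squared deviations of the outcomes from the estimated intercept. The function names `one_sided_fit` and `rdd_t_test` are hypothetical, and the inputs are assumed to be NumPy arrays.

```python
from statistics import NormalDist

import numpy as np


def triangular_kernel(u):
    """Triangular kernel K(u) = max(1 - |u|, 0)."""
    return np.clip(1.0 - np.abs(u), 0.0, None)


def one_sided_fit(y, z, h, side):
    """Local quadratic intercept at the threshold and a plug-in variance estimate."""
    n = len(z)
    keep = (z >= 0) if side == "+" else (z < 0)
    w = keep * triangular_kernel(z / h) / h                 # one-sided kernel weights
    r = np.column_stack([np.ones(n), z / h, (z / h) ** 2])  # rows (1, Z_i/h, (Z_i/h)^2)
    gamma = (r * w[:, None]).T @ r / n                      # weighted design moment matrix
    mu_hat = np.linalg.solve(gamma, (r * w[:, None]).T @ y / n)[0]
    resid = y - mu_hat                                      # deviations from the intercept
    psi = (r * (w**2 * resid**2)[:, None]).T @ r / n
    g = np.linalg.solve(gamma, np.array([1.0, 0.0, 0.0]))
    return mu_hat, g @ psi @ g / n                          # sandwich variance of mu_hat


def rdd_t_test(y, z, h, theta0=0.0, alpha=0.05):
    """Two-sided test of H0: ATE at the threshold equals theta0."""
    mu_plus, v_plus = one_sided_fit(y, z, h, "+")
    mu_minus, v_minus = one_sided_fit(y, z, h, "-")
    theta_hat = mu_plus - mu_minus
    stat = (theta_hat - theta0) / np.sqrt(v_plus + v_minus)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return theta_hat, stat, bool(abs(stat) > crit)


# Noiseless check: with linear conditional means the quadratic fit recovers the jump.
z = np.linspace(-1.0, 1.0, 401)
y = 2.0 + 3.0 * z + 1.5 * (z >= 0)
theta_hat, stat, reject = rdd_t_test(y, z, h=0.5, theta0=1.5)
print(theta_hat, reject)
```

The bandwidth is deliberately left as an input: the uniform validity result discussed in this section concerns deterministic bandwidth sequences, and data-driven bandwidth selection is outside the scope of this sketch.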


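Theorem 4.2. Let $\mathbf{Q}$ be defined as in (23), $\mathbf{P}$ be as in (4) and $\mathbf{P}_0$ and $\mathbf{P}_1$ be as in (8). If
$nh_n \to \infty$, $h_n \to 0$ and $nh_n^7 \to 0$, then the CCT test statistic from (24) satisfies

$$ T_n^{CCT}(X^{(n)}) = \frac{\hat{\theta}_n - \theta_0}{\hat{S}_n} \xrightarrow{d} \mathcal{N}(0,1) \tag{26} $$

as $n \to \infty$, where $X^{(n)}$ are i.i.d. $P_n$ and $P_n$ is any sequence of distributions such that $P_n \in \mathbf{P}_0$ for
all $n \ge 1$. This in turn implies that $\{\phi_n^{CCT}\}_{n=1}^{\infty}$ in (25) is uniformly asymptotically level $\alpha$, and,
in fact, has limiting size equal to $\alpha$, i.e.

$$ \limsup_{n \to \infty} \sup_{P \in \mathbf{P}_0} E_{P^n}[\phi_n^{CCT}] = \alpha . \tag{27} $$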


The proof of the above essentially requires slightly altering the proof of the pointwise result in 
Calonico et al. (2014) to any sequence of distributions Pn such that Pn G Pq for all n > 1. For 
completeness, we provide a proof in the supplement appendix. 
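
For a polynomial bandwidth sequence h_n = c * n^(-a) with c > 0, the rate conditions on the bandwidth can be checked directly. The sketch below assumes the three conditions are n h_n → ∞, h_n → 0 and n h_n^7 → 0, the last being the rate at which a bias of order h_n^3 becomes negligible relative to a standard deviation of order (n h_n)^(-1/2); the function name is hypothetical.

```python
from fractions import Fraction


def satisfies_rate_conditions(a):
    """Check h_n = c * n**(-a), c > 0, against the three limits.

    n * h_n    -> infinity  iff  1 - a > 0
    h_n        -> 0         iff  a > 0
    n * h_n^7  -> 0         iff  1 - 7 * a < 0
    """
    a = Fraction(a)
    return (1 - a > 0) and (a > 0) and (1 - 7 * a < 0)


print(satisfies_rate_conditions(Fraction(1, 5)))   # exponent 1/5 lies strictly between 1/7 and 1
print(satisfies_rate_conditions(Fraction(1, 10)))  # bandwidth shrinks too slowly for the bias condition
```

Under these assumed conditions the admissible exponents are exactly those with 1/7 < a < 1.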

Remark 4.4. Note that when $\mathbf{Q}$ is defined as in (23) the arguments used to prove Theorem 4.1 do
not go through. In particular, the constructed sequence $\{P_k\}_{k=1}^{\infty}$ in (16) will not fall in $\mathbf{P}$, as the
corresponding $\{Q_k\}_{k=1}^{\infty}$ in (17) will not fall in $\mathbf{Q}$. To see why, note that for large enough $k$ we have
$\kappa(Q_k) < \kappa$, and either $\mu_-(z, Q_k)$ or $\mu_+(z, Q_k)$ is discontinuous at either $z = -\kappa(Q_k)$ or $z = \kappa(Q_k)$.
This implies $Q_k$ will not satisfy Assumption 4.1 as $\mu_-(z, Q_k)$ or $\mu_+(z, Q_k)$ will not be continuous
for all $z \in (-\kappa, \kappa)$. Intuitively, Assumption 4.1 excludes extreme sequences such as $\{Q_k\}_{k=1}^{\infty}$ for
which nonparametric tools work poorly to give a uniform limit result. For recent additional results
on uniform testing in RDD, see Armstrong and Kolesár (2016) and Calonico et al. (2016). ■
A Additional Notation

Let $Z^{(n)} = \{Z_i : 1 \le i \le n\}$ denote the observed sample of the random variable $Z$. Let $a_n \precsim b_n$ denote
$a_n \le A b_n$, where $a_n$ and $b_n$ are deterministic sequences and $A$ is a positive constant uniform in $P$. Let $|\cdot|$
denote the Euclidean matrix norm. As we use the notion of convergence in probability under the sequence
of distributions $P_n$, let $A_n = o_{P_n}(1)$ denote

$$ P_n(|A_n| > \epsilon) \to 0 \quad \text{as } n \to \infty , $$

for a sequence of random variables $A_n \sim P_n$, where $\epsilon$ is any constant such that $\epsilon > 0$. Further, in Table 1
below, we introduce additional notation to keep our arguments concise.


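Notation                      Definition

$H(h_n)$                      $\mathrm{diag}(1, h_n^{-1}, h_n^{-2})$
$r(Z_i/h_n)$                  $(1, Z_i/h_n, (Z_i/h_n)^2)'$
$Z_n(h_n)$                    $(r(Z_1/h_n), \ldots, r(Z_n/h_n))'$
$k(u)$                        $(1 - u) 1\{0 \le u \le 1\}$
$K(u)$                        $k(-u) 1\{u < 0\} + k(u) 1\{u \ge 0\}$
$K_{h_n}(u)$                  $K(u/h_n)/h_n$
$W_{+,n}(h_n)$                $\mathrm{diag}(1\{Z_1 \ge 0\} K_{h_n}(Z_1), \ldots, 1\{Z_n \ge 0\} K_{h_n}(Z_n))$
$\Gamma_{+,n}(h_n)$           $Z_n(h_n)' W_{+,n}(h_n) Z_n(h_n)/n$
$S_n(h_n)$                    $((Z_1/h_n)^3, \ldots, (Z_n/h_n)^3)'$
$\vartheta_{+,n}(h_n)$        $Z_n(h_n)' W_{+,n}(h_n) S_n(h_n)/n$
$e$                           $(1, 0, 0)'$
$\mu(z,P)$                    $E_P[Y \mid Z = z]$
$\mu_+(P)$                    $\lim_{z \to 0^+} \mu(z,P)$
$\mu_-(P)$                    $\lim_{z \to 0^-} \mu(z,P)$
$\mu^{(v)}(z,P)$              $d^v \mu(z,P)/dz^v$
$\mu_+^{(v)}(P)$              $\lim_{z \to 0^+} \mu^{(v)}(z,P)$
$\sigma^2(z,P)$               $\mathrm{Var}_P[Y \mid Z = z]$
$\Sigma_n(P)$                 $\mathrm{diag}(\sigma^2(Z_1,P), \ldots, \sigma^2(Z_n,P))$
$\Psi_{+,n}(h_n,P)$           $Z_n(h_n)' W_{+,n}(h_n) \Sigma_n(P) W_{+,n}(h_n) Z_n(h_n)/n$
$Y_n$                         $(Y_1, \ldots, Y_n)'$
$\hat{\beta}_{+,n}$           $H(h_n) \Gamma_{+,n}^{-1}(h_n) Z_n(h_n)' W_{+,n}(h_n) Y_n/n$

Table 1: Important Notation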


Next, we provide an extended description of the test statistic used. For our null hypothesis as stated in
the paper, the test statistic can be rewritten as

$$ T_n^{CCT}(X^{(n)}) = \frac{\hat{\mu}_{+,n} - \hat{\mu}_{-,n} - (\mu_+(P) - \mu_-(P))}{\hat{S}_n} , \tag{A-1} $$

where $\mu_+(P) - \mu_-(P) = \theta_0$, $\hat{\mu}_{+,n}$ and $\hat{\mu}_{-,n}$ are bias corrected local linear estimates of $\mu_+(P)$ and $\mu_-(P)$,
and

$$ \hat{S}_n = \sqrt{\hat{V}_{+,n} + \hat{V}_{-,n}} , $$

where $\hat{V}_{+,n}$ and $\hat{V}_{-,n}$ are plug-in estimates of the variances of $\hat{\mu}_{+,n}$ and $\hat{\mu}_{-,n}$ conditional on $Z^{(n)}$; see (C-13)
for the plug-in estimator used. The biases of both estimates are estimated using local quadratic estimators.
Furthermore, for all estimates, we use the triangular kernel, i.e. $k(u)$ in Table 1, and a deterministic
sequence of bandwidth choices denoted by $h_n$. Throughout this document, we provide results for quantities
with subscript (+), as arguments for those with subscript (−) follow symmetrically. In addition, as noted
in Calonico et al. (2014a, Remark 7), we exploit the fact that in our simple version of the test statistic the
estimates are numerically equivalent to those from a non bias corrected local quadratic estimator. In turn,
we can write

$$ \hat{\mu}_{+,n} = e' \hat{\beta}_{+,n} , $$

which reduces the length of the proof presented below. Further, as stated in the paper, note that

$$ \mathbf{Q} = \{Q \in \mathbf{Q}_{\mathcal{W}} : Q \text{ satisfies Assumption 4.1}\} , \tag{A-3} $$

and that

$$ \mathbf{P} = \{QM^{-1} : Q \in \mathbf{Q}\} , \tag{A-4} $$

where $\mathbf{Q}_{\mathcal{W}}$, $M^{-1}$ and Assumption 4.1 are as defined in the paper.
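B Auxiliary Lemmas


Lemma B.1. Let $\mathbf{Q}$ be defined as in (A-3), $\mathbf{P}$ be as in (A-4) and $P_n \in \mathbf{P}$ for all $n \ge 1$. If $nh_n \to \infty$ and
$h_n \to 0$, then

(i) $\Gamma_{+,n}(h_n) = \tilde{\Gamma}_{+,n}(h_n) + o_{P_n}(1)$, where $\tilde{\Gamma}_{+,n}(h_n) = \int_0^{\infty} K(u) r(u) r(u)' f_{P_n}(uh_n) \, du \in [\Gamma_L, \Gamma_U]$.

(ii) $\vartheta_{+,n}(h_n) = \tilde{\vartheta}_{+,n}(h_n) + o_{P_n}(1)$, where $\tilde{\vartheta}_{+,n}(h_n) = \int_0^{\infty} K(u) r(u) u^3 f_{P_n}(uh_n) \, du \in [\vartheta_L, \vartheta_U]$.

(iii) $h_n \Psi_{+,n}(h_n, P_n) = \tilde{\Psi}_{+,n}(h_n) + o_{P_n}(1)$, where $\tilde{\Psi}_{+,n}(h_n) = \int_0^{\infty} K(u)^2 r(u) r(u)' \sigma^2(uh_n, P_n) f_{P_n}(uh_n) \, du \in [\Psi_L, \Psi_U]$.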


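Proof. For (i), a change of variables gives us

$$ \begin{aligned}
E_{P_n}[\Gamma_{+,n}(h_n)] &= E_{P_n}\left[ \frac{1}{nh_n} \sum_{i=1}^{n} 1\{Z_i \ge 0\} K(Z_i/h_n) r(Z_i/h_n) r(Z_i/h_n)' \right] \\
&= \frac{1}{h_n} \int_0^{\infty} K(z/h_n) r(z/h_n) r(z/h_n)' f_{P_n}(z) \, dz \\
&= \int_0^{\infty} K(u) r(u) r(u)' f_{P_n}(uh_n) \, du = \tilde{\Gamma}_{+,n}(h_n) .
\end{aligned} $$

Further, since $h_n < \kappa$ for large enough $n$, we have that $L \le f_{P_n}(z) \le U$ by Assumption 4.1, which implies
that

$$ \Gamma_L = L \int_0^{\infty} K(u) r(u) r(u)' \, du \le \tilde{\Gamma}_{+,n}(h_n) \le U \int_0^{\infty} K(u) r(u) r(u)' \, du = \Gamma_U , $$

and that

$$ \begin{aligned}
E_{P_n}\left[ |\Gamma_{+,n}(h_n) - E_{P_n}[\Gamma_{+,n}(h_n)]|^2 \right] &\le \frac{1}{nh_n^2} E_{P_n}\left[ |1\{Z_i \ge 0\} K(Z_i/h_n) r(Z_i/h_n) r(Z_i/h_n)'|^2 \right] \\
&= \frac{1}{nh_n} \int_0^{\infty} K(u)^2 |r(u)|^4 f_{P_n}(uh_n) \, du \\
&\le \frac{U}{nh_n} \int_0^{\infty} K(u)^2 |r(u)|^4 \, du \\
&= O(n^{-1} h_n^{-1}) = o(1) .
\end{aligned} $$

It then follows by Markov’s inequality that $\Gamma_{+,n}(h_n) = \tilde{\Gamma}_{+,n}(h_n) + o_{P_n}(1)$. Analogously, closely following
Calonico et al. (2014b, Lemma S.A.1), we can show Lemma B.1 (ii)-(iii). ■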


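Lemma B.2. Let $\mathbf{Q}$ be defined as in (A-3), $\mathbf{P}$ be as in (A-4) and $P_n \in \mathbf{P}$ for all $n \ge 1$. If $nh_n \to \infty$ and
$h_n \to 0$, then


(i) $E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] = \mu_+(P_n) + h_n^3 e' \Gamma_{+,n}^{-1}(h_n) \vartheta_{+,n}(h_n) \mu_+^{(3)}(P_n)/3! + o_{P_n}(h_n^3)$ .

(ii) $V_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] = n^{-1} e' \Gamma_{+,n}^{-1}(h_n) \Psi_{+,n}(h_n, P_n) \Gamma_{+,n}^{-1}(h_n) e \equiv V_{+,n}(h_n, P_n)$ .

(iii) $(V_{+,n}(h_n, P_n))^{-1/2} (\hat{\mu}_{+,n} - E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}]) \xrightarrow{d} \mathcal{N}(0,1)$ .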
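Proof. For (i), by taking the expectation conditional on $Z^{(n)}$, we have

$$ E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] = e' H(h_n) \Gamma_{+,n}^{-1}(h_n) Z_n(h_n)' W_{+,n}(h_n) E_{P_n}[Y_n \mid Z^{(n)}]/n . $$

Further, as $h_n < \kappa$ for large enough $n$, we have by the required differentiability in Assumption 4.1 and a
Taylor expansion that

$$ E_{P_n}[Y_n \mid Z^{(n)}]/n = Z_n(h_n) H(h_n)^{-1} \beta_+(P_n)/n + S_n(h_n) h_n^3 \mu_+^{(3)}(P_n)/(6n) + o_{P_n}(h_n^3) , $$

where $\beta_+(P_n) = (\mu_+(P_n), \mu_+^{(1)}(P_n), \mu_+^{(2)}(P_n)/2)'$. It then follows from Lemma B.1 that

$$ E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] = \mu_+(P_n) + h_n^3 e' \Gamma_{+,n}^{-1}(h_n) \vartheta_{+,n}(h_n) \mu_+^{(3)}(P_n)/3! + o_{P_n}(h_n^3) . $$

For (ii), a simple calculation gives us

$$ V_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] = e' H(h_n) \Gamma_{+,n}^{-1}(h_n) Z_n(h_n)' W_{+,n}(h_n) \Sigma_n(P_n) W_{+,n}(h_n) Z_n(h_n) \Gamma_{+,n}^{-1}(h_n) H(h_n) e/n^2 . $$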

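For (iii), first note that from Lemma B.1 we have $V_{+,n}(h_n, P_n) = \tilde{V}_{+,n}(h_n) + o_{P_n}(1)$, where

$$ \tilde{V}_{+,n}(h_n) = (nh_n)^{-1} e' \tilde{\Gamma}_{+,n}^{-1}(h_n) \tilde{\Psi}_{+,n}(h_n) \tilde{\Gamma}_{+,n}^{-1}(h_n) e . $$

Then rewrite as follows

$$ \frac{\hat{\mu}_{+,n} - E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}]}{\sqrt{V_{+,n}(h_n, P_n)}} = \left( \frac{\tilde{V}_{+,n}(h_n)}{V_{+,n}(h_n, P_n)} \right)^{1/2} (\tilde{V}_{+,n}(h_n))^{-1/2} e' \Gamma_{+,n}^{-1}(h_n) \tilde{\Gamma}_{+,n}(h_n) A_n^{1/2} \xi_n , \tag{B-5} $$

where

$$ \xi_n = \sum_{i=1}^{n} \omega_{n,i} \varepsilon_{n,i} , \qquad \varepsilon_{n,i} = Y_i - E_{P_n}[Y_i \mid Z_i] , $$

$$ A_n = (nh_n)^{-1} \tilde{\Gamma}_{+,n}^{-1}(h_n) \tilde{\Psi}_{+,n}(h_n) \tilde{\Gamma}_{+,n}^{-1}(h_n) , \text{ and} $$

$$ \omega_{n,i} = A_n^{-1/2} \tilde{\Gamma}_{+,n}^{-1}(h_n) 1\{Z_i \ge 0\} K_{h_n}(Z_i) r(Z_i/h_n)/n . $$

Next note that for any $a \in \mathbb{R}^3$ we have that $\{a' \omega_{n,i} \varepsilon_{n,i} : 1 \le i \le n\}$ is a triangular array of independent
random variables with $E_{P_n}[a' \xi_n] = 0$ and $V_{P_n}[a' \xi_n] = a'a$. Further, this triangular array satisfies the
Lindeberg condition. To see why, first note that by Lemma B.1 we have

$$ |A_n| \ge (nh_n)^{-1} |A_L| \tag{B-6} $$

for some value $A_L \in \mathbb{R}$, which is uniform in $P$. We then have in addition, by Lemma B.1 and a change of
variables, that

$$ \begin{aligned}
\sum_{i=1}^{n} E_{P_n}\left[ |\omega_{n,i} \varepsilon_{n,i}|^4 \right] &\precsim (nh_n)^2 \, n \, E_{P_n}\left[ |1\{Z_i \ge 0\} K_{h_n}(Z_i) r(Z_i/h_n)/n|^4 \right] \\
&\precsim (nh_n)^2 n^{-3} h_n^{-4} \int_0^{\infty} |K(z/h_n) r(z/h_n)|^4 f_{P_n}(z) \, dz \\
&\precsim (nh_n)^2 n^{-3} h_n^{-3} = O((nh_n)^{-1}) = o(1) ,
\end{aligned} $$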
and hence, using the Lindeberg-Feller CLT, we have that $a' \xi_n \xrightarrow{d} \mathcal{N}(0, a'a)$ as $n \to \infty$. Since this holds for
any $a \in \mathbb{R}^3$, the Cramér-Wold theorem implies that $\xi_n \xrightarrow{d} \mathcal{N}(0, I_3)$ as $n \to \infty$, where $I_3$ denotes the identity
matrix of size three. Furthermore, analogous to $V_{+,n}(h_n, P_n) = \tilde{V}_{+,n}(h_n) + o_{P_n}(1)$, we can show that

$$ \frac{\tilde{V}_{+,n}(h_n)}{V_{+,n}(h_n, P_n)} = 1 + o_{P_n}(1) . \tag{B-7} $$

Further, by Lemma B.1 we have that

$$ \Gamma_{+,n}^{-1}(h_n) \tilde{\Gamma}_{+,n}(h_n) = I_3 + o_{P_n}(1) . \tag{B-8} $$

Substituting the above results in (B-5) concludes the proof. ■


C Proof of Theorem 4.2 


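Here we show only that

$$ \frac{\hat{\mu}_{+,n} - \mu_+(P_n)}{\sqrt{\hat{V}_{+,n}}} \xrightarrow{d} \mathcal{N}(0,1) , $$

since under similar arguments it will follow that

$$ \frac{\hat{\mu}_{-,n} - \mu_-(P_n)}{\sqrt{\hat{V}_{-,n}}} \xrightarrow{d} \mathcal{N}(0,1) , $$

and then due to independence we can conclude that $T_n^{CCT}(X^{(n)}) \xrightarrow{d} \mathcal{N}(0,1)$ as $n \to \infty$. To this end, first rewrite

$$ \frac{\hat{\mu}_{+,n} - \mu_+(P_n)}{\sqrt{\hat{V}_{+,n}}} = \frac{\hat{\mu}_{+,n} - \mu_+(P_n)}{\sqrt{V_{+,n}(h_n, P_n)}} \left( \frac{V_{+,n}(h_n, P_n)}{\hat{V}_{+,n}} \right)^{1/2} . $$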

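Step 1. We show that

$$ \frac{\hat{\mu}_{+,n} - \mu_+(P_n)}{\sqrt{V_{+,n}(h_n, P_n)}} \xrightarrow{d} \mathcal{N}(0,1) . $$

To begin, first rewrite the above as

$$ \frac{\hat{\mu}_{+,n} - E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}]}{\sqrt{V_{+,n}(h_n, P_n)}} + \left( \frac{\tilde{V}_{+,n}(h_n)}{V_{+,n}(h_n, P_n)} \right)^{1/2} \frac{E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] - \mu_+(P_n)}{\sqrt{\tilde{V}_{+,n}(h_n)}} . \tag{C-9} $$

In Lemma B.2 (iii), we showed that

$$ \frac{\hat{\mu}_{+,n} - E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}]}{\sqrt{V_{+,n}(h_n, P_n)}} \xrightarrow{d} \mathcal{N}(0,1) \quad \text{and} \quad \frac{\tilde{V}_{+,n}(h_n)}{V_{+,n}(h_n, P_n)} = 1 + o_{P_n}(1) . $$

It then remains to show that

$$ \frac{E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] - \mu_+(P_n)}{\sqrt{\tilde{V}_{+,n}(h_n)}} = o_{P_n}(1) $$

to conclude. To this end, note that by Lemma B.2, Lemma B.1 and (B-6), it follows that

$$ \frac{E_{P_n}[\hat{\mu}_{+,n} \mid Z^{(n)}] - \mu_+(P_n)}{\sqrt{\tilde{V}_{+,n}(h_n)}} = O(\sqrt{nh_n}) \{ O(h_n^3) + o_{P_n}(h_n^3) \} = O(\sqrt{nh_n^7}) + o_{P_n}(1) = o_{P_n}(1) $$

as $h_n \to 0$, $nh_n \to \infty$ and $nh_n^7 \to 0$.

Step 2. We show that

$$ \frac{V_{+,n}(h_n, P_n)}{\hat{V}_{+,n}} = 1 + o_{P_n}(1) . \tag{C-10} $$

To begin note that

$$ nh_n (V_{+,n}(h_n, P_n) - \hat{V}_{+,n}) = e' \Gamma_{+,n}^{-1}(h_n) \cdot h_n (\Psi_{+,n}(h_n, P_n) - \hat{\Psi}_{+,n}(h_n)) \cdot \Gamma_{+,n}^{-1}(h_n) e , \tag{C-11} $$

where

$$ h_n (\Psi_{+,n}(h_n, P_n) - \hat{\Psi}_{+,n}(h_n)) = h_n Z_n(h_n)' W_{+,n}(h_n) (\Sigma_n(P_n) - \hat{\Sigma}_{+,n}) W_{+,n}(h_n) Z_n(h_n)/n , \tag{C-12} $$

and

$$ \hat{\Sigma}_{+,n} = \mathrm{diag}(\hat{\varepsilon}_{+,n,1}^2, \ldots, \hat{\varepsilon}_{+,n,n}^2) , \tag{C-13} $$

such that $\hat{\varepsilon}_{+,n,i} = Y_i - \hat{\mu}_{+,n}$. Further, note that by construction, we can write

$$ Y_i = \mu(Z_i, P_n) + \varepsilon_{n,i} , \tag{C-14} $$

such that $E_{P_n}[\varepsilon_{n,i}] = 0$ and $\mathrm{Var}_{P_n}[\varepsilon_{n,i} \mid Z = z] = \sigma^2(z, P_n)$. This in turn implies

$$ \hat{\varepsilon}_{+,n,i} = \varepsilon_{n,i} + \mu(Z_i, P_n) - \mu_+(P_n) + \mu_+(P_n) - \hat{\mu}_{+,n} . \tag{C-15} $$

We can then expand (C-12) to get the following

$$ \begin{aligned}
h_n (\Psi_{+,n}(h_n, P_n) - \hat{\Psi}_{+,n}(h_n)) = {} & \underbrace{h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} (\sigma^2(Z_i, P_n) - \varepsilon_{n,i}^2) K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,1} \ \text{(a)}} \\
& - \underbrace{h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} (\mu(Z_i, P_n) - \hat{\mu}_{+,n})^2 K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,2} \ \text{(b)}} \\
& - \underbrace{2 h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} \varepsilon_{n,i} (\mu(Z_i, P_n) - \hat{\mu}_{+,n}) K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,3} \ \text{(c)}} .
\end{aligned} $$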

For quantity (a), since Assumption 2.1 (i), Assumption 2.1 (ii) and Assumption 2.1 (iv) are satisfied with
the required uniform constants, we have by a change of variables that

$$ E_{P_n}[|B_{n,1}|^2] \precsim (nh_n)^{-1} \int_0^{\infty} K(u)^4 |r(u)|^4 \, du = O((nh_n)^{-1}) = o(1) , $$

which implies by Markov’s inequality that $B_{n,1} = o_{P_n}(1)$. For quantity (b), note first that we can rewrite it
as

$$ \begin{aligned}
B_{n,2} = {} & \underbrace{h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} (\mu(Z_i, P_n) - \mu_+(P_n))^2 K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,21}} \\
& + (\mu_+(P_n) - \hat{\mu}_{+,n})^2 \cdot \underbrace{h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,22}} \\
& + 2 (\mu_+(P_n) - \hat{\mu}_{+,n}) \cdot \underbrace{h_n \sum_{i=1}^{n} 1\{Z_i \ge 0\} (\mu(Z_i, P_n) - \mu_+(P_n)) K_{h_n}(Z_i)^2 r(Z_i/h_n) r(Z_i/h_n)'/n}_{= B_{n,23}} .
\end{aligned} $$

Next, since Assumption 2.1 (i) and Assumption 2.1 (iii) are satisfied with the required uniform constants,
we have by a Taylor approximation and a change of variables that

$$ E_{P_n}[|B_{n,21}|^2] \precsim n^{-1} h_n^3 \int_0^{\infty} K(u)^4 u^4 |r(u)|^4 \, du = O(n^{-1} h_n^3) = o(1) , $$

which implies by Markov’s inequality that $B_{n,21} = o_{P_n}(1)$. Further, since Assumption 2.1 (i) is satisfied with
the required uniform constants, we have by a change of variables that

$$ E_{P_n}[|B_{n,22}|^2] \precsim (nh_n)^{-1} \int_0^{\infty} K(u)^4 |r(u)|^4 \, du = O((nh_n)^{-1}) = o(1) , $$

which implies by Markov’s inequality that $B_{n,22} = o_{P_n}(1)$. Finally, since Assumption 2.1 (i) and Assumption
2.1 (iii) are satisfied with the required uniform constants, we have by a Taylor approximation and a change
of variables that

$$ E_{P_n}[|B_{n,23}|^2] \precsim n^{-1} h_n \int_0^{\infty} K(u)^4 u^2 |r(u)|^4 \, du = O(n^{-1} h_n) = o(1) , $$

which implies by Markov’s inequality that $B_{n,23} = o_{P_n}(1)$. Since $\mu_+(P_n) - \hat{\mu}_{+,n} = o_{P_n}(1)$ by (C-9), we can
conclude for quantity (b) that $B_{n,2} = o_{P_n}(1)$. For quantity (c), using analogous arguments, we can conclude
that $B_{n,3} = o_{P_n}(1)$, and hence

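$$ h_n (\Psi_{+,n}(h_n, P_n) - \hat{\Psi}_{+,n}(h_n)) = o_{P_n}(1) . \tag{C-16} $$

In addition, since from Lemma B.1 we have that $\Gamma_{+,n}^{-1}(h_n) = O_{P_n}(1)$, it then follows that

$$ nh_n (V_{+,n}(h_n, P_n) - \hat{V}_{+,n}) = o_{P_n}(1) . \tag{C-17} $$


To conclude, first rewrite (C-17) as

$$ \frac{V_{+,n}(h_n, P_n)}{\tilde{V}_{+,n}(h_n)} - \frac{\hat{V}_{+,n}}{\tilde{V}_{+,n}(h_n)} = o_{P_n}(1) , $$

and our result then follows from (B-7).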